3574 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German
Availability:
Freely Available
License:
Unspecified
Size:
4.5M en-de + 0.6M en-fr sentence pairs
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation
- Paper track: Long paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shuhao Gu | WMT14 Data | /N |
Documentation:
I don't know.
Written
Corpus,
Language Type:
Bilingual
Languages:
English Mandarin Chinese
Availability:
Freely Available
License:
Creative Commons Non-Commercial 3.0 Licenses
Size:
2M sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation
- Paper track: Long paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shuhao Gu | UM-Corpus | /N |
Documentation:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/774_Paper.pdf
Written
Corpus,
Language Type:
Bilingual
Languages:
English Mandarin Chinese
Availability:
From Data Center(s)
License:
LDC
Size:
1.25M sentence pairs
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation
- Paper track: Long paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shuhao Gu | LDC Chinese-English Parallel Corpus | /N |
Documentation:
I don't know.
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Arabic Egyptian Arabic English South Levantine Arabic
Availability:
Freely Available
License:
Creative Commons Attribution-ShareAlike 4.0 International Public License
Size:
8988 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: The SADID Evaluation Datasets for Low-Resource Spoken Language Machine Translation of Arabic Dialects
- Paper track: Long paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Wael Abid | The SADID Evaluation Datasets for Arabic Dialects | /N |
Documentation:
Yes. English. Publicly available
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC User Agreement for Non-Members
Size:
39260 sentences
Production Status:
Existing-used
Use:
Language Generation
- Paper title: Generalized Shortest-Paths Encoders for AMR-to-Text Generation
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Lisa Jin | Abstract Meaning Representation (AMR) Annotation Release 2.0 | /N |
Documentation:
None
Written
Evaluation Data,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
17907 entries
Production Status:
Newly created-in progress
Use:
Evaluation/Validation
- Paper title: Would you describe a leopard as yellow? Evaluating crowd-annotations with justified and informative disagreement
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pia Sommerauer | ProperCR | /N |
Documentation:
Data documentation in English
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
None
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: Measuring Correlation-to-Causation Exaggeration in Press Releases
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bei Yu | Correlation-to-Causation Exaggeration in Press Releases | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
405 KByte
Production Status:
Existing-used
Use:
Machine Learning
- Paper title: Measuring Correlation-to-Causation Exaggeration in Press Releases
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bei Yu | Causal language use in scientific literature | /N |
Documentation:
Yu, B., Li, Y. and Wang, J. (2019). Detecting Causal Language Use in Science Findings. EMNLP 2019, pages 4656–4666, Hong Kong, China, November 3–7, 2019.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
55112 KByte
Production Status:
Existing-used
Use:
Parsing and Tagging
- Paper title: Semi-Supervised Dependency Parsing with Arc-Factored Variational Autoencoding
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ge Wang | Penn Tree Bank | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
12 MByte
Production Status:
Newly created-finished
Use:
Summarisation
- Paper title: News Editorials: Towards Summarizing Long Argumentative Texts
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shahbaz Syed | Webis-EditorialSum-2020 | /N |
Documentation:
Yes. Documentation is available in English and provided in the supplementary material alongside the corpus.